Efficient Approximation of Optimal Control for Markov Games

Authors

  • Markus N. Rabe
  • Sven Schewe
  • Lijun Zhang
Abstract

We study the time-bounded reachability problem for continuous-time Markov decision processes (CTMDPs) and games (CTMGs). Existing techniques for this problem use discretisation to break time into discrete intervals, and optimal control is approximated for each interval separately. Current techniques provide an accuracy of O(ε²) on each interval, which leads to an infeasibly large number of intervals. We propose a sequence of approximations that achieve accuracies of O(ε³), O(ε⁴), and O(ε⁵), which allow us to drastically reduce the number of intervals that are considered. For CTMDPs, the performance of the resulting algorithms is comparable to the heuristic approach given by Buchholz and Schulz [6], while also being theoretically justified. All of our results generalise to CTMGs, where they yield the first practically implementable algorithms for this problem. We also provide positional strategies for both players that achieve similar error bounds.
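
To see why the per-interval accuracy matters so much: covering a horizon T with intervals of length δ takes roughly N = T/δ intervals, so a per-interval error of O(δ^k) accumulates to a global error of about O(T·δ^(k-1)); a higher k therefore permits a much larger δ, and far fewer intervals, for the same target precision. The sketch below is not the paper's higher-order scheme; it is only a minimal Python illustration, under an assumed rate-tensor representation, of the baseline first-order discretisation of maximal time-bounded reachability in a CTMDP, whose per-interval error is the O(ε²) mentioned above.

# Minimal sketch, not the paper's algorithm: the standard first-order
# discretisation for maximal time-bounded reachability in a CTMDP.
# Assumed representation: R[a, s, t] is the rate from state s to t under
# action a (with R[a, s, s] = 0) and `goal` is a list of goal states.
import numpy as np

def max_timed_reachability(R, goal, T, delta):
    num_actions, n, _ = R.shape
    E = R.sum(axis=2)                      # exit rate of each (action, state)
    assert delta * E.max() <= 1.0, "need delta * max exit rate <= 1"
    v = np.zeros(n)
    v[goal] = 1.0                          # reaching the goal has value 1
    for _ in range(int(np.ceil(T / delta))):
        # stay put with probability 1 - delta*E, or make one discretised jump
        q = (1.0 - delta * E) * v + delta * np.einsum('ast,t->as', R, v)
        v = q.max(axis=0)                  # maximise over actions
        v[goal] = 1.0                      # goal states stay absorbing
    return v                               # v[s] approximates the optimal reachability probability from s

With this first-order scheme the number of intervals must grow in proportion to 1/precision, which is exactly the blow-up that higher per-interval accuracies are meant to avoid.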

Similar articles

Efficient Approximation of Optimal Control for Continuous-Time Markov Games

We study the time-bounded reachability problem for continuous-time Markov decision processes (CTMDPs) and games (CTMGs). Existing techniques for this problem use discretisation techniques to break time into discrete intervals of size ε, and optimal control is approximated for each interval separately. Current techniques provide an accuracy of O(ε²) on each interval, which leads to an infeasibly...


Value Function Approximation in Zero-Sum Markov Games

This paper investigates value function approximation in the context of zero-sum Markov games, which can be viewed as a generalization of the Markov decision process (MDP) framework to the two-agent case. We generalize error bounds from MDPs to Markov games and describe generalizations of reinforcement learning algorithms to Markov games. We present a generalization of the optimal stopping probl...
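
For context (this is not code from the paper): the dynamic-programming backbone that such approximation schemes build on is Shapley's value iteration, in which every backup solves a small zero-sum matrix game at each state. The sketch below assumes a discounted game given by a reward tensor reward[s, a1, a2] and a transition tensor trans[s, a1, a2, s']; all names and the data layout are illustrative assumptions.

# Minimal sketch (assumed data layout): exact Shapley value iteration for a
# discounted zero-sum Markov game; each backup solves a matrix game by LP.
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    # value of the zero-sum matrix game A for the maximising row player:
    # max v subject to x^T A >= v componentwise, sum(x) = 1, x >= 0
    m, k = A.shape
    c = np.concatenate([np.zeros(m), [-1.0]])            # minimise -v
    A_ub = np.hstack([-A.T, np.ones((k, 1))])            # v - x^T A[:, j] <= 0
    A_eq = np.concatenate([np.ones(m), [0.0]])[None, :]  # probabilities sum to 1
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(k), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[-1]

def shapley_value_iteration(reward, trans, gamma=0.95, iters=200):
    n = reward.shape[0]
    v = np.zeros(n)
    for _ in range(iters):
        q = reward + gamma * (trans @ v)                 # one-step lookahead q[s, a1, a2]
        v = np.array([matrix_game_value(q[s]) for s in range(n)])
    return v

Value function approximation replaces the exact value table v with a parameterised approximation, but each backup still reduces to matrix games of this form.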


Optimal Time-Abstract Schedulers for CTMDPs and Markov Games

We study time-bounded reachability in continuous-time Markov decision processes for time-abstract scheduler classes. Such reachability problems play a paramount rôle in dependability analysis and the modelling of manufacturing and queueing systems. Consequently, their analysis has been studied intensively, and techniques for the approximation of optimal control are well understood. From a mathe...


Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning-automata-based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...


A Novel Successive Approximation Method for Solving a Class of Optimal Control Problems

This paper presents a successive approximation method (SAM) for solving a large class of optimal control problems. The proposed analytical-approximate method successively solves the Two-Point Boundary Value Problem (TPBVP) obtained from Pontryagin's Maximum Principle (PMP). The convergence of this method is proved, and a control design algorithm with low computational complexity is present...
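
As a point of reference (not the SAM algorithm itself): PMP turns such a problem into a TPBVP in the state and costate. The toy sketch below sets up that TPBVP for the scalar problem of minimising ∫(x² + u²) dt subject to ẋ = u, where stationarity of the Hamiltonian gives u* = -p/2, the costate satisfies ṗ = -2x, and the transversality condition is p(T) = 0; it then solves the TPBVP by simple shooting on the unknown initial costate. The problem instance and all names are illustrative assumptions.

# Toy illustration (assumed problem, not SAM): shooting on the TPBVP from PMP.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

def costate_residual(p0, x0=1.0, T=1.0):
    # integrate the PMP equations x' = -p/2, p' = -2x forward from (x0, p0)
    # and return p(T); the TPBVP is solved when this residual is zero
    sol = solve_ivp(lambda t, y: [-y[1] / 2.0, -2.0 * y[0]],
                    (0.0, T), [x0, p0], rtol=1e-9, atol=1e-9)
    return sol.y[1, -1]

# choose p(0) so that the transversality condition p(T) = 0 holds
p0_star = brentq(costate_residual, -10.0, 10.0)   # about 2*tanh(1) for x0 = 1, T = 1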



Journal:
  • CoRR

Volume abs/1011.0397  Issue 

Pages  -

Publication date 2010